Deep Reinforcement Learning-based Sum Rate Fairness Trade-off for Cell-Free mMIMO
Authors
Abstract
The uplink of a cell-free massive multiple-input multiple-output (mMIMO) system with maximum-ratio combining (MRC) and zero-forcing (ZF) schemes is investigated. A power allocation optimization problem is considered, where two conflicting metrics, namely the sum rate and fairness, are jointly optimized. As there is no closed-form expression for the achievable rate in terms of the large-scale fading (LSF) components, the sum rate fairness trade-off optimization problem cannot be solved by using known convex optimization methods. To alleviate this problem, we propose two new approaches. For the first approach, a use-and-then-forget scheme is utilized to derive a closed-form expression for the achievable rate. Then, the optimization problem is solved iteratively through the proposed sequential convex approximation (SCA) scheme. For the second approach, we exploit the LSF coefficients as inputs of a twin delayed deep deterministic policy gradient (TD3), which efficiently solves the non-convex optimization problem. Next, the complexity and convergence properties of the proposed schemes are analyzed. Numerical results demonstrate the superiority of the proposed approaches over conventional power control algorithms in terms of the sum rate and the minimum user rate for both the ZF and MRC receivers. Moreover, the TD3-based power control achieves better performance than the SCA-based approach as well as fractional power control.
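The tension between sum rate and fairness described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the interference model, the `leak` factor, the weighted objective, and the function names `user_rates` and `tradeoff_objective` are hypothetical and are not the rate expression or the objective used in the paper.

```python
import math

def user_rates(powers, gains, noise=1.0, leak=0.1):
    """Toy uplink rates (bit/s/Hz) for a hypothetical interference model.

    Each user's signal power is power * large-scale gain; a fraction
    `leak` of every other user's received power acts as interference.
    This is NOT the paper's achievable-rate expression.
    """
    received = [p * g for p, g in zip(powers, gains)]
    total = sum(received)
    return [
        math.log2(1.0 + r / (noise + leak * (total - r)))
        for r in received
    ]

def tradeoff_objective(powers, gains, weight):
    """Assumed scalarization: weight * sum rate + (1 - weight) * min rate."""
    rates = user_rates(powers, gains)
    return weight * sum(rates) + (1.0 - weight) * min(rates)

# One strong user and one weak user (hypothetical large-scale gains).
gains = [1.0, 0.1]
full_power = [1.0, 1.0]
backed_off = [0.5, 1.0]  # the strong user transmits at half power

for name, p in [("full power", full_power), ("backed off", backed_off)]:
    r = user_rates(p, gains)
    print(name, "sum rate:", round(sum(r), 3), "min rate:", round(min(r), 3))
```

In this toy setting, reducing the strong user's power raises the weak user's rate but lowers the sum rate, which is why a single power allocation cannot maximize both metrics at once and some trade-off weight (or a learned policy) has to arbitrate between them.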
Similar resources
Robust Zero-Sum Deep Reinforcement Learning
This paper presents a methodology for evaluating the sensitivity of deep reinforcement learning policies. This is important when agents are trained in a simulated environment and there is a need to quantify the sensitivity of such policies before exposing agents to the real world where it is hazardous to employ RL policies. In addition, we provide a framework, inspired by H∞ control theory, for...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning
Temporal-difference-based deep-reinforcement learning methods have typically been driven by off-policy, bootstrap Q-Learning updates. In this paper, we investigate the effects of using on-policy, Monte Carlo updates. Our empirical results show that for the DDPG algorithm in a continuous action space, mixing on-policy and off-policy update targets exhibits superior performance and stability comp...
Fairness in Reinforcement Learning
We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with ...
Vision-based Deep Reinforcement Learning
Recently, Google Deepmind showcased how Deep learning can be used in conjunction with existing Reinforcement Learning (RL) techniques to play Atari games[11], beat a world-class player [14] in the game of Go and solve complicated riddles [3]. Deep learning has been shown to be successful in extracting useful, nonlinear features from high-dimensional media such as images, text, video and audio [...
Journal
Journal title: IEEE Transactions on Vehicular Technology
Year: 2022
ISSN: 0018-9545, 1939-9359
DOI: https://doi.org/10.1109/tvt.2022.3230041